Understanding Content Lifecycle Management in VIDIZMO
To help you reduce storage costs and better organize your content, VIDIZMO lets you create content lifecycle policies. With these policies, you can delete, purge, or change the Azure access tier of your Portal content according to your requirements.
Content lifecycle policies help you efficiently manage your storage and Portal content by allowing you to relocate or remove content based on criteria you define. These policies can also be used to move content to different Azure access tiers, each with varying storage and access costs. The Azure access tiers are Hot, Cool, and Archive.
These policies are scalable and can be used to manage content at any volume, making them a powerful tool for reducing storage and access costs on your Portal.
About Content Lifecycle Policies
These policies let users manage their Portal content and reduce the cost of storing data, especially at large scale and when the volume keeps growing. They also address another fundamental issue users face: the cost of storing or retaining infrequently accessed data that they must keep to comply with data retention policies.
Keeping all this data in high-speed storage is costly because of its high storage costs. With content lifecycle policies, users can move infrequently accessed data to a tier with lower storage costs. For instance, they can put infrequently accessed data in the Cool tier and compliance data in the Archive tier.
Note that you can only use content lifecycle policies to store content in different access tiers if the storage provider configured in VIDIZMO supports tiering.
Content lifecycle policies also let you narrow down the type of Portal content you want to move to a specific access tier by defining highly specific selection criteria. This lets you target exactly the content you want to affect and saves you from the tedious task of manually selecting content and changing its tier, which can lead to mistakes or missed files. A content lifecycle policy instead targets all content matching the criteria across your Portal.
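As a rough illustration of how selection criteria narrow down content, the sketch below filters a small library by content type and idle time. The `ContentItem` class, its field names, and the `matches` function are hypothetical and do not represent VIDIZMO's actual schema or API.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ContentItem:
    title: str
    content_type: str   # e.g. "video" or "image" (hypothetical field)
    last_accessed: date
    tier: str           # "Hot", "Cool", or "Archive"


def matches(item: ContentItem, content_type: str, idle_days: int, today: date) -> bool:
    """Return True when the item has the given type and has been idle long enough."""
    idle_for = today - item.last_accessed
    return item.content_type == content_type and idle_for >= timedelta(days=idle_days)


today = date(2024, 6, 1)
library = [
    ContentItem("Town hall", "video", date(2023, 1, 10), "Hot"),
    ContentItem("Logo", "image", date(2023, 1, 10), "Hot"),
    ContentItem("Weekly sync", "video", date(2024, 5, 28), "Hot"),
]

# Select videos that have not been accessed for at least 180 days.
selected = [item.title for item in library if matches(item, "video", 180, today)]
print(selected)
```

Only the long-idle video is selected; the image and the recently accessed video are left untouched.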
In addition, you can configure these policies to run either manually, with the click of a button, or automatically via a continuous workflow.
Change Tier Policies
Change tier policies are only available as a policy type if an Azure storage that supports tiering is configured as your Portal’s storage provider. When one of these policies runs, the content matching its selection criteria is moved to the access tier you selected. You can use these policies to move your Portal content into and out of the Hot, Cool, or Archive tiers. You can also use them to unarchive (rehydrate) your content.
Soft Delete Policies
Users can also create Soft Delete policies that delete the Portal content matching the criteria defined in the policy. When a soft delete policy is run, the targeted content is removed from your Portal library and goes into the Recycle Bin, which acts as a retention area for the content before it is permanently purged.
Note: If you have an Azure storage configured as the storage provider for your Portal, content moved to the Recycle Bin by a soft delete policy retains its access tier.
Purge Policies
You can create a purge policy that permanently deletes content from the Recycle Bin. A purge policy only purges content that both matches the policy's criteria and is present in the Recycle Bin; its scope is limited to the content inside the Recycle Bin.
Note: When a Portal is created, a Default Purge Policy is also created as part of the content lifecycle policy feature.
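To make the scope of a purge policy concrete, here is a minimal sketch. The `Item` class and its two flags are hypothetical stand-ins for VIDIZMO's real content records and criteria evaluation.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    in_recycle_bin: bool
    matches_criteria: bool  # outcome of the policy's selection criteria (hypothetical flag)


def purge_targets(items):
    """Select only items that are in the Recycle Bin AND match the purge criteria."""
    return [i.title for i in items if i.in_recycle_bin and i.matches_criteria]


items = [
    Item("old-recording", in_recycle_bin=True, matches_criteria=True),
    Item("recent-delete", in_recycle_bin=True, matches_criteria=False),
    Item("live-video", in_recycle_bin=False, matches_criteria=True),
]
targets = purge_targets(items)
print(targets)
```

Content that matches the criteria but still sits in the Portal library is never selected for purging.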
Policy Execution
To enable the execution of the content lifecycle policies on your Portal, you need to turn on the toggle for ‘Tenant Policy Workflow’ present on the ‘Content Lifecycle Page’ in your ‘Portal Settings.’ This toggle acts as a switch for the entirety of the execution of all policies across the Portal.
Note: You also need to enable this workflow if you want to manually change tiers for your Portal content without using a content policy.
When this toggle is turned on, it initializes a ‘sticky’ workflow (a workflow that runs continuously), which monitors which policy needs to run now or next. Unlike other workflows, the continuous workflow does not ‘finish’ or get cancelled on its own; it stays active until the toggle is turned off. The continuous workflow is responsible for executing all policies, whether automatic or manual.
When this workflow is active, it analyzes and queues all the jobs or tasks for policy execution and executes them in order. The workflow also resumes after an interruption: if it was switched off or could not finish while executing a policy, it continues from the point where it stopped. It does not move to the next job until it finishes the job that was in progress.
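The resume behavior can be pictured with a toy queue in which a job leaves the queue only once it has completed. This is an illustrative sketch, not VIDIZMO's implementation. The `PolicyQueue` class, the job names, and the "work units" budget are all assumptions.

```python
from collections import deque


class PolicyQueue:
    """Toy queue: a job is removed only after all of its work is done."""

    def __init__(self, jobs):
        self.pending = deque(jobs)   # (policy name, remaining work units)
        self.completed = []

    def run(self, budget):
        """Process up to `budget` work units, then stop (simulating an interruption)."""
        while self.pending and budget > 0:
            name, remaining = self.pending[0]
            done = min(remaining, budget)
            budget -= done
            remaining -= done
            if remaining == 0:
                self.pending.popleft()
                self.completed.append(name)
            else:
                # Keep the unfinished job at the head so it is resumed first.
                self.pending[0] = (name, remaining)


queue = PolicyQueue([("archive-old-videos", 5), ("purge-recycle-bin", 2)])
queue.run(budget=3)    # interrupted partway through the first policy
after_interrupt = (list(queue.pending), list(queue.completed))
queue.run(budget=10)   # resumes the first policy before starting the second
print(after_interrupt, queue.completed)
```

After the interruption, the first policy still heads the queue with its remaining work, so the second policy cannot start ahead of it.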
Workflow Execution States
You can determine the state of the workflow from the status shown on the content lifecycle policy page. The continuous workflow can have the following statuses:
- Running: Indicates that the workflow is active and executing policies as it receives requests, according to the configuration and prioritization of the content lifecycle policies.
- Pending: Indicates that the workflow is being initialized for policy execution. This occurs after the toggle has been turned on, which indicates that the workflow has received the request to begin.
- Failed: Indicates that the workflow has encountered an error during the policy execution and was not able to complete or fulfill the execution of the content lifecycle policy.
- Cancelling: Indicates that the workflow has received the request to halt or terminate its execution.
- Canceled: Indicates that the execution of the workflow has been terminated or stopped.
- Finished: This status is shown after the toggle has been turned off after a complete execution of policy.
Note that when the continuous workflow is turned off and then on again, it continues executing the policy from the last request it received. It does not move on to the next policy until the request for the previous policy has been completed, regardless of how the content lifecycle policies are prioritized.
Priority of Policy Execution
Each content policy is configured for either manual or automatic execution. A policy configured for manual execution can only be started by clicking its button on the content lifecycle policy page. A manually triggered policy takes priority over the next queued policy. For instance, suppose the continuous workflow is running one policy and a second policy is queued after it. If the user then triggers a policy manually, the workflow completes the policy in progress and then runs the manually triggered policy before the second queued policy.
Policies set to automatic are queued according to their order ID on the content lifecycle policy page. You can decide which policies take precedence by dragging and reordering them, because the order ID determines the automatic execution priority.
The policy with the lowest order ID, which is the one at the top of the list, has the highest priority. The remaining policies are processed according to their order ID, with priority decreasing from top to bottom. Automatic policies are queued by the scheduler service, which runs at a time interval that you can configure on the content lifecycle policy page.
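The ordering rules above can be summarized in a short sketch. The function name and data shapes are hypothetical. How several manual requests are ordered among themselves is not stated in this article, so handling them in the order requested is an assumption.

```python
def execution_order(in_progress, automatic, manual_requests):
    """
    in_progress: name of the policy currently running, or None
    automatic: list of (order_id, name) for queued automatic policies
    manual_requests: names of manually triggered policies, in request order (assumption)
    """
    order = [in_progress] if in_progress else []
    order += manual_requests                            # manual runs jump ahead of the queue
    order += [name for _, name in sorted(automatic)]    # lowest order ID first
    return order


order = execution_order(
    in_progress="policy-A",
    automatic=[(3, "policy-C"), (2, "policy-B")],
    manual_requests=["policy-M"],
)
print(order)
```

The policy already in progress finishes first, the manually triggered policy follows, and the automatic policies then run by ascending order ID.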
Flow for the Tenant Policy Workflow
- User turns the toggle on.
- VIDIZMO WebApp facilitates this request.
- Requests are sent through Web API.
- DLL Stack is utilized by the request.
- Request is sent as a message or event to the broker.
- The various services in VIDIZMO act as subscribers to the Broker (they are subscribed to the topics) and they are listening for certain events.
- When the initial request for the workflow is received, the workflow is activated for a single Portal.
- The workflow is continuous and constantly listens for certain Events occurring for that single Portal.
- It queues all the requests for automatic activities, and it listens for Events that occur in cases of manual execution of policies or manual change tier activities.
- When these events are detected, the workflow carries out the corresponding activities.
- The workflow keeps on going until it stops (the toggle is turned off by the user).
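The publish/subscribe part of this flow can be sketched with a tiny in-memory broker. This only illustrates the pattern; the `Broker` class and the topic names `workflow.start` and `workflow.stop` are hypothetical and are not VIDIZMO's actual message broker or event names.

```python
from collections import defaultdict


class Broker:
    """Minimal in-memory publish/subscribe broker."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, payload):
        for handler in self.subscribers[topic]:
            handler(payload)


active_portals = set()   # Portals whose continuous workflow is currently on
broker = Broker()

# A service subscribes to the topics it cares about.
broker.subscribe("workflow.start", lambda portal_id: active_portals.add(portal_id))
broker.subscribe("workflow.stop", lambda portal_id: active_portals.discard(portal_id))

broker.publish("workflow.start", "portal-1")   # user turns the toggle on
state_after_on = set(active_portals)
broker.publish("workflow.stop", "portal-1")    # user turns the toggle off
print(state_after_on, active_portals)
```

Each start event activates the workflow for a single Portal, which mirrors the per-Portal scope described in the steps above.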
The advantage of this workflow is that it makes fewer calls to Azure and keeps policy execution optimized, because it does not rely on a purely event-based architecture, which would consume too many resources. Every time a policy is executed, its details are logged on its ‘History’ page with the help of a service in VIDIZMO.
Note: The workflow pauses for a set time between the executions of automatic policies. You can adjust this duration via the workflow settings on the content lifecycle policy page.
Azure Access Tiers
VIDIZMO allows you to reduce storage costs by letting you store your content in Azure access tiers if you have an Azure storage provider configured that supports tiering. The access tiers you can select are Hot, Cool, and Archive. The access tiers differ in terms of storage costs, access costs, and more. Each access tier is optimized to hold data of specific importance. You determine the importance of data and decide which access tier it belongs to accordingly.
Hot Tier
The Hot tier specializes in containing data that is active and accessed often. This storage tier has the highest storage costs but the lowest access costs, making it ideal for containing data used frequently.
Cool Tier
The Cool Tier is where you keep data that is accessed rarely or infrequently. The Cool tier has lower storage costs but higher access costs than the Hot storage. Cool storage is ideal for holding data temporarily and creating short-term backups.
You must store and retain the data in the Cool tier for at least 30 days. If you either delete or move the data to a different tier before this duration, you incur an early deletion fee.
Archive Tier
The Archive Tier contains data that is rarely accessed or used but is important enough to be stored and maintained for extended periods. The archive tier is an offline access tier, which means that data inside of it can't be read or modified.
To read or modify the data, you must first unarchive it by moving it to either the Hot or Cool tier, a process known as Rehydration (or Blob Rehydration). The Archive tier has a higher data retrieval cost than Hot and Cool storage but the lowest storage costs of all the Azure access tiers.
Like the Cool tier, data in the Archive tier is subject to an early deletion fee if it is deleted or moved to a different tier before 180 days have passed.
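The minimum retention periods described above (30 days for Cool, 180 days for Archive) can be expressed as a simple check. This is a minimal sketch; the function and constant names are illustrative, and it does not calculate the fee amount.

```python
# Minimum days content must stay in a tier before it can be deleted or moved without a fee.
MIN_RETENTION_DAYS = {"Cool": 30, "Archive": 180}


def early_deletion_applies(tier: str, days_in_tier: int) -> bool:
    """Return True if deleting or moving the content now would incur an early deletion fee."""
    minimum = MIN_RETENTION_DAYS.get(tier, 0)   # tiers not listed (e.g. Hot) have no minimum here
    return days_in_tier < minimum


print(early_deletion_applies("Cool", 10))
print(early_deletion_applies("Archive", 200))
```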
Rehydration
The process of unarchiving involves moving content from the Archive tier to another access tier. The time taken to unarchive your data depends on the size of the data and the type of unarchiving priority used. You can unarchive your data using two types of priorities.
Standard Priority
Standard priority is the default priority for unarchiving data. It enables you to unarchive 10 GB of data within 15 hours. Standard unarchiving commands are processed in the order they are received.
High Priority
Unarchiving your data with high priority is faster and costs more in comparison to standard priority. You can unarchive 10 GB of data in the timeframe of 1 to 3 hours. High priority unarchiving commands also take priority over standard ones.
If high-priority unarchiving of data less than 10 GB takes more than 5 hours, you’ll be charged at the standard priority rate instead.
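The billing rule in the previous paragraph can be written as a small function. This is a sketch of the rule as stated here, with illustrative names; it does not compute actual prices.

```python
def billed_priority(requested: str, size_gb: float, hours_taken: float) -> str:
    """Return the priority rate that applies to a completed unarchive request."""
    # High-priority requests under 10 GB that take more than 5 hours
    # are charged at the standard rate instead.
    if requested == "High" and size_gb < 10 and hours_taken > 5:
        return "Standard"
    return requested


print(billed_priority("High", size_gb=4, hours_taken=2))
print(billed_priority("High", size_gb=4, hours_taken=6))
```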
Azure Access Tier Handling in VIDIZMO
When configuring your Azure storage provider, you can set a default access tier for new content uploaded or generated on your Portal. You can also select an option where VIDIZMO asks you for the access tier every time new content is uploaded or created through copying, clipping, processing, or similar activities. The default tier remains active in this case, as it handles situations where content is continuously brought into your Portal, such as content ingestion.
Processing content for an activity, such as transcoding or redaction, does not affect its access tier. However, if a processing activity yields new content, such as a copy or metadata files, then the results vary.
Files Created Post-Processing
If a processing activity, such as AI Insights like Transcriptions or Detections, yields metadata or timed data files, these files inherit the access tier of the base Portal content or file that was processed (except for Thumbnails, which we will discuss further on).
However, if a processing activity creates a copy of the base file, as when you perform a ‘Redact as New,’ the files it generates go to the same access tier as the newly created copy.
If the option to ‘Ask Tier’ is turned on in your storage provider settings, then you can determine which access tiers these copies will go into. Read more here: Utilizing Azure Access Tiers in VIDIZMO.docx
Example
Let us understand this with an example where the default tier is Cool, and the option that asks you for the access tier is disabled.
If you process a video in the Hot tier for a Redaction activity, then it remains in the Hot tier after its processing is complete. The metadata files generated from this Redaction activity go to the Hot tier, as it is the access tier of the result.
However, if you process a video in the Hot tier with the 'Redact as New' activity (which creates a redacted copy of the base video), then the resulting redacted copy goes to the Cool tier as it is the default tier. The metadata files generated from this activity go to the Cool tier as well.
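The tier rules in this example can be captured in a short sketch. The function name and parameters are hypothetical, thumbnails are deliberately excluded because they follow a separate rule, and `chosen_tier` stands in for the selection made when the ‘Ask Tier’ option is enabled.

```python
def output_tiers(source_tier, creates_copy, default_tier, chosen_tier=None):
    """Return (tier of the processed result, tier of its generated metadata files)."""
    if creates_copy:
        # A new copy goes to the tier the user picked, or else to the default tier.
        result_tier = chosen_tier or default_tier
    else:
        # In-place processing leaves the content in its current tier.
        result_tier = source_tier
    # Metadata files follow the tier of the result they belong to.
    return result_tier, result_tier


print(output_tiers("Hot", creates_copy=False, default_tier="Cool"))   # Redaction
print(output_tiers("Hot", creates_copy=True, default_tier="Cool"))    # Redact as New
```

With the default tier set to Cool, an in-place Redaction keeps everything in Hot, while ‘Redact as New’ places the copy and its metadata in Cool.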
Storing Thumbnails
The VIDIZMO Application creates thumbnails for all content present on the Portal. Thumbnails always go in the Hot tier regardless of which access tier contains the original or base content. Even content present in the Archive Tier has its thumbnails in the Hot tier.
The idea behind storing thumbnails in the Hot tier is to make your content easily identifiable regardless of where it is present on the Portal. This functionality enables you to view content present in the Archive tab, even though files in the archived tier can't be read or modified.
Thumbnails for images and videos are generated only in the browser. The following table lists the formats and browsers that support thumbnail generation in VIDIZMO.
| Format | Browser Support |
|---|---|
| .mp4 | Chrome, Firefox, Safari, Edge, Opera |
| .webm | Chrome, Firefox, Opera, Edge (newer versions) |
| .ogg | Firefox, Chrome, Opera |
| .mov (QuickTime) | Primarily Safari; other browsers may require plugins or specific codecs |
| .mpeg | Generally supported, but less commonly used |
Note: Users who are using non-browser clients, such as Desktop App, will not be able to generate thumbnails when they directly upload to the archive tier.
Content in Folders and Cases
The content inside a folder and case inherits the access tier of its parent or root. This functionality applies to content being moved or created in a folder as well. For instance, a Hot tier image changes its access tier to Cool when you move it to a Cool tier folder. A folder (or case) and the files inside always have the same access tier.
The VIDIZMO application informs you and asks for confirmation if a file's access tier will be affected when it is copied or moved to a folder or case.
Like all content, you can manually change the access tier of a folder or case from the Portal library. However, the tier change can't happen if any content inside is either locked or in a processing state.
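The inheritance rule, and the restriction on tier changes when content is locked or processing, can be sketched as follows. The `File` and `Folder` classes are hypothetical models for illustration, not VIDIZMO's data structures.

```python
from dataclasses import dataclass, field


@dataclass
class File:
    name: str
    tier: str
    locked: bool = False
    processing: bool = False


@dataclass
class Folder:
    tier: str
    files: list = field(default_factory=list)

    def add(self, file: File) -> None:
        """Moving a file into the folder makes it inherit the folder's tier."""
        file.tier = self.tier
        self.files.append(file)

    def change_tier(self, new_tier: str) -> bool:
        """Change the tier of the folder and its files, unless any file blocks the change."""
        if any(f.locked or f.processing for f in self.files):
            return False
        self.tier = new_tier
        for f in self.files:
            f.tier = new_tier
        return True


folder = Folder(tier="Cool")
image = File("diagram.png", tier="Hot")
folder.add(image)
tier_after_move = image.tier          # the Hot image now follows the Cool folder

image.locked = True
changed = folder.change_tier("Archive")   # blocked because a file inside is locked
print(tier_after_move, changed, folder.tier)
```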
Content Migration
VIDIZMO allows you to configure multiple storage providers for your Portal and to migrate your content from one storage provider to another. VIDIZMO also caters to scenarios where you migrate content between a storage provider that supports access tiers and one that does not. To learn how VIDIZMO handles these migration scenarios, see: Utilizing Azure Access Tiers in VIDIZMO.docx.
Locking to Prevent Policy Action
You can prevent any policy from acting on your files by setting the files to a locked state. Locking ensures that you don’t purge, delete, or archive content that you don’t want modified.
In VIDIZMO, you can selectively lock content that’s present in your Portal’s library. Another way to lock content is via the Preview screen when you’re creating or editing a content policy.
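As a minimal sketch of how locking shields content from a policy, the function below splits matching content into items the policy acts on and items it skips. The dictionary keys and function name are assumptions made for illustration only.

```python
def apply_policy(items):
    """Split content into items the policy acts on and locked items it must skip."""
    acted_on = [item["name"] for item in items if not item["locked"]]
    skipped = [item["name"] for item in items if item["locked"]]
    return acted_on, skipped


matching_content = [
    {"name": "board-meeting.mp4", "locked": True},
    {"name": "old-webinar.mp4", "locked": False},
]
acted_on, skipped = apply_policy(matching_content)
print(acted_on, skipped)
```

Even though both files match the policy's criteria, only the unlocked one is affected.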
For a more in-depth look into locking and how to do it, refer to: Use Lock in VIDIZMO to Restrict Content Policies